
Model Validation Project - Backtesting a Historical VaR Model through the 2023 Banking Crisis¶


Objective:¶

To implement and backtest a 99% 1-Day Historical Simulation VaR model, analyze its performance during the March 2023 banking crisis, and connect model failures (breaches) to specific financial news events.


Things to know:¶

  • VaR: Value at Risk (VaR) is a measure used to assess the potential loss in the value of a portfolio over a given time period and at a given confidence level. As John C. Hull aptly explains in his book, "when using the value-at-risk measure, an analyst is interested in making a statement of the following form: 'I am X percent certain there will not be a loss of more than V dollars in the next N days.' The variable V is the VaR of the portfolio. It is a function of two parameters: the time horizon (N days) and the confidence level (X%). It is the loss level over N days that has a probability of only (100 - X)% of being exceeded."

  • How to implement VaR: The most straightforward way to implement a VaR model is Historical Simulation, which uses past data to estimate what may happen in the future (see the small sketch after this list).

  • For this project, a Historical Simulation VaR model is implemented and backtested over a crisis period to understand how the model performs in volatile markets and why it falls short.
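To make the idea concrete, the following is a minimal, self-contained sketch of a 1-day 99% Historical Simulation VaR computed on simulated returns. The random returns, the seed and the variable names are purely illustrative and are not part of the project data.

import numpy as np

# One year (252 trading days) of made-up daily returns, for illustration only
rng = np.random.default_rng(42)
daily_returns = rng.normal(0.0005, 0.01, 252)

confidence_level = 0.99
alpha = 1 - confidence_level

# Historical Simulation VaR: the loss level exceeded on only alpha (here 1%) of past days
var_99 = -np.percentile(daily_returns, 100 * alpha)

portfolio_value = 1_000_000
print(f"1-day 99% VaR: {var_99:.2%} of the portfolio, i.e. about ${var_99 * portfolio_value:,.0f}")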


Project Prompt:¶

You are a junior quant on a model validation team. Your bank uses a basic 99% 1-day Historical Simulation VaR model for preliminary risk monitoring on some of its trading portfolios. In the wake of the March 2023 collapse of Silicon Valley Bank (SVB) and Signature Bank, your manager has asked you to conduct a post-mortem analysis. Your key task is to assess whether the simple VaR model was adequate to capture the risks during the March 2023 banking crisis. Identify every day on which the model failed (a 'breach') and provide a narrative explanation for each failure using evidence from financial news.


The Strategy:¶

The post-mortem analysis of the VaR model can be split into 2 distinct parts as follows:

  1. The Quantitative Analysis - This includes downloading the requisite data, constructing a hypothetical portfolio and implementing the Historical Simulation VaR model on that data. Once the model results are available, the backtest is run over the crisis period and the breaches are visualised.

  2. The News-Driven Analysis - This includes gathering financial news excerpts for the days on which breaches occurred and diving deep into the news to see whether it mentions events that the model could not have anticipated and that therefore led to its failure.


Part 1 - The Quantitative Analysis¶

  • Tools: Python, with the libraries pandas, numpy, yfinance and matplotlib. If these libraries are not installed, run 'pip install yfinance pandas numpy matplotlib' in a terminal or command prompt.

    • pandas: The primary tool for working with structured data in a spreadsheet-like format; its DataFrame object is used throughout this project.
    • numpy: The fundamental package for scientific computing. It is useful for numerical operations, especially calculating the percentile for VaR.
    • yfinance: A convenient library to download historical market data from Yahoo Finance directly into the script.
    • matplotlib.pyplot: The standard library for creating static, animated, and interactive visualizations in Python.
  • Data: Daily stock price data from Yahoo Finance (yfinance).

  • Portfolio: Equally-weighted portfolio with an assumed value of $1,000,000, comprising the following tickers:

    • JPM (JPMorgan Chase - a large, systemic bank)
    • KRE (SPDR S&P Regional Banking ETF - direct exposure to the crisis epicenter)
    • SPY (SPDR S&P 500 ETF - broad market context)
  • Time Period:

    • Lookback Window for VaR Calculation: 252 days (approximately 1 trading year).
    • Backtesting Period: January 1, 2023, to April 30, 2023.

1.1 Importing the necessary libraries¶

When importing the necessary libraries, it is common practice to import them under an alias (observe "import pandas as pd" in the following code excerpt). This makes it easy to call their functions later in the code. These aliases can differ based on personal preference, as long as they are used consistently throughout the document.

In [1]:
# Core libraries for data manipulation and numerical operations
import pandas as pd
import numpy as np

# Library for downloading financial data
import yfinance as yf

# Library for data visualization (only 'pyplot' and 'dates' modules from matplotlib library are being imported here)
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

1.2 Downloading data from Yahoo Finance¶

In [2]:
# Specifying tickers, their weights and portfolio value

tickers = ['JPM', 'KRE', 'SPY']
weights = np.array([1/3, 1/3, 1/3])
portfolio_value = 1_000_000  

# Specifying the dates for lookback window (1 trading year before the backtesting period begins)
start_date = '2022-01-01'
end_date = '2023-04-30'

# Downloading all available data for the tickers
data = yf.download(tickers, start=start_date, end=end_date)
print(data.columns)
/tmp/ipykernel_1840440/2254962435.py:12: FutureWarning: YF.download() has changed argument auto_adjust default to True
  data = yf.download(tickers, start=start_date, end=end_date)
[*********************100%***********************]  3 of 3 completed
MultiIndex([( 'Close', 'JPM'),
            ( 'Close', 'KRE'),
            ( 'Close', 'SPY'),
            (  'High', 'JPM'),
            (  'High', 'KRE'),
            (  'High', 'SPY'),
            (   'Low', 'JPM'),
            (   'Low', 'KRE'),
            (   'Low', 'SPY'),
            (  'Open', 'JPM'),
            (  'Open', 'KRE'),
            (  'Open', 'SPY'),
            ('Volume', 'JPM'),
            ('Volume', 'KRE'),
            ('Volume', 'SPY')],
           names=['Price', 'Ticker'])

Note: The adjusted close price (published by Yahoo Finance under the column heading "Adj Close") reflects the stock's closing price after adjusting for corporate actions like:

  • Dividends

  • Stock splits

  • Rights offerings

It is primarily used for backtesting and historical analysis because it gives a consistent basis for comparison across time. However, as can be observed from the printed column headings above, yfinance did not return an 'Adj Close' column for this download. This is because recent versions of yfinance default to auto_adjust=True (see the FutureWarning in the output above), which applies dividend and split adjustments directly to the 'Close' column and drops the separate 'Adj Close' column.

Therefore, since no separate 'Adj Close' column is available, this project uses the 'Close' price, which is already adjusted under this default.
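If both the raw close and a separate 'Adj Close' column are needed, auto_adjust can be set to False explicitly. The snippet below is only a minimal sketch of that option (the exact behaviour depends on the installed yfinance version) and is not required for the rest of this project.

# Requesting unadjusted prices explicitly; this should return both 'Close' and 'Adj Close' columns
raw_data = yf.download(tickers, start=start_date, end=end_date, auto_adjust=False)
print(raw_data.columns.get_level_values('Price').unique())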

1.3 Calculating log-returns of the data¶

Log returns measure the change in price from one time period to the next on a logarithmic scale (the natural logarithm of the ratio of consecutive prices). They are preferred in quantitative finance, especially in backtesting and portfolio analysis, because log returns, unlike simple returns, are closer to normally distributed and are time-additive.

Time-additivity means that log returns simply add up over time, as observed below:

$$ \ln\left(\frac{P_3}{P_1}\right) = \ln\left(\frac{P_2}{P_1} \cdot \frac{P_3}{P_2}\right) = \ln\left(\frac{P_2}{P_1}\right) + \ln\left(\frac{P_3}{P_2}\right) $$

$$ \Rightarrow r_{1,3} = r_{1,2} + r_{2,3} $$

This is not true of simple returns.
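The identity above is easy to verify numerically. The following is a small illustrative check with made-up prices (not part of the project data):

import numpy as np

prices = np.array([100.0, 103.0, 99.5])              # P1, P2, P3 (made up)

# Daily log returns add up to the two-day log return
step_log_returns = np.diff(np.log(prices))            # r_(1,2), r_(2,3)
print(step_log_returns.sum(), np.log(prices[-1] / prices[0]))

# Daily simple returns do not add up to the two-day simple return
step_simple_returns = np.diff(prices) / prices[:-1]
print(step_simple_returns.sum(), prices[-1] / prices[0] - 1)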

In [3]:
# Calculating log returns using the 'Close' price
close_prices = data['Close']
log_returns = np.log(1 + close_prices.pct_change())

# Calculating the weighted average of the individual asset returns
portfolio_returns = log_returns.dot(weights)

# Drop the first row which is NaN (since there's no previous day to calculate a return)
portfolio_returns = portfolio_returns.dropna()
portfolio_returns.name = 'Portfolio Return'

print("Data downloaded and prepared.")
print(f"Portfolio returns from {portfolio_returns.index.min().date()} to {portfolio_returns.index.max().date()}")
print("\nSample of Close Prices DataFrame:")
print(close_prices.head())
print("\nSample of Final Portfolio Returns Series:")
print(portfolio_returns.head())
Data downloaded and prepared.
Portfolio returns from 2022-01-04 to 2023-04-28

Sample of Close Prices DataFrame:
Ticker             JPM        KRE         SPY
Date                                         
2022-01-03  146.291061  65.188011  454.466919
2022-01-04  151.836884  67.030655  454.314667
2022-01-05  149.060974  66.597115  445.590851
2022-01-06  150.644638  69.117195  445.172272
2022-01-07  152.137238  69.794617  443.412323

Sample of Final Portfolio Returns Series:
Date
2022-01-04    0.021583
2022-01-05   -0.014776
2022-01-06    0.015590
2022-01-07    0.005217
2022-01-10   -0.000614
Name: Portfolio Return, dtype: float64

1.4 Implementing a rolling backtest for VaR¶

Notes:

  • Since the look-back period is 1 year, the number of trading days therein would be 252.
  • In the context of VaR, α (alpha) represents the probability of a loss exceeding the VaR threshold, i.e. the left-tail area under the return distribution. It is calculated as: $$ \alpha = 1 - \text{confidence level} $$
  • For each day, the VaR threshold is calculated from the returns of the 252 days preceding the current day. This is done by slicing the historical window in each iteration of the loop.
  • The VaR threshold is the value below which 1% of the returns in the historical window fall (since the confidence level is 99%). It is standard convention to multiply the resulting number by -1 so that the loss is expressed as a positive number.
  • In each iteration, the calculated VaR and the actual return, together with the corresponding date, are appended to a results list, which is converted into a DataFrame after the loop for further analysis.
In [4]:
# Specifying the backtesting parameters
lookback_days = 252  
confidence_level = 0.99
alpha = 1 - confidence_level         

# Initiating a list to store the results
results = []

# Iterating over the backtest days; the loop starts at index 252 rather than 0 so that a full 252 days of prior data are available from the very first iteration
for i in range(lookback_days, len(portfolio_returns)):
    
    # Creating a slice of the last 252 returns before the current day
    historical_window = portfolio_returns.iloc[i - lookback_days : i]
    
    # Calculating VaR
    var_99 = -np.percentile(historical_window, 100 * alpha)
    
    # Fetching the actual return for the current day 
    actual_return = portfolio_returns.iloc[i]
    
    # Appending the results (date, VaR, actual return)
    results.append({
        'Date': portfolio_returns.index[i],
        'VaR_99': var_99,
        'Actual_Return': actual_return
    })

# Converting the list of results into a pandas DataFrame
results_df = pd.DataFrame(results)
results_df.set_index('Date', inplace=True)

print("Backtest complete. Results DataFrame created.","\n")
print(results_df.head())
Backtest complete. Results DataFrame created. 

              VaR_99  Actual_Return
Date                               
2023-01-05  0.036237      -0.010276
2023-01-06  0.036237       0.023506
2023-01-09  0.036237      -0.004598
2023-01-10  0.036237       0.007827
2023-01-11  0.036237       0.008498

1.5 Identifying the breaches¶

  • A breach occurs when the actual loss on a given day exceeds the predicted VaR. Since the VaR and the actual return for each day are already available in a DataFrame, all that remains is to compare the two values and find the days on which the actual return is more negative than the -VaR value.

  • The number of breaches and the breach rate are calculated along with the expected number of breaches (Total days × α). If the actual number of breaches is significantly higher than expected, it is a strong sign that the model is underestimating risk. In this case, the actual number of breaches will be higher due to the presence of a crisis.

  • To count the breaches, .sum() is used: in Python, True equals 1 and False equals 0, so .sum() adds up the number of True values, i.e. the number of breaches.

In [5]:
# Comparing the values and creating a boolean 'Breach' column which is True if the actual loss exceeded the predicted VaR and False otherwise
results_df['Breach'] = results_df['Actual_Return'] < -results_df['VaR_99']

# Calculating and printing the results
num_breaches = results_df['Breach'].sum()
total_days = len(results_df)
breach_rate = num_breaches / total_days

print(f"Backtesting Period: {results_df.index.min().date()} to {results_df.index.max().date()}")
print(f"Total Trading Days: {total_days}")
print(f"Number of Breaches: {num_breaches}")
print(f"Breach Rate: {breach_rate:.2%}")
print(f"Expected Breaches at 99% Confidence: {total_days * alpha:.2f}")

# Filtering only the rows where 'Breach' is True
breach_days = results_df[results_df['Breach']]

# Displaying the specific days where the model failed
print("\nBreach Days: ")
print(breach_days)
Backtesting Period: 2023-01-05 to 2023-04-28
Total Trading Days: 79
Number of Breaches: 3
Breach Rate: 3.80%
Expected Breaches at 99% Confidence: 0.79

Breach Days: 
              VaR_99  Actual_Return  Breach
Date                                       
2023-03-09  0.033380      -0.052956    True
2023-03-13  0.034967      -0.050304    True
2023-03-17  0.036566      -0.037364    True
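To put the breach count into perspective, the probability of seeing at least this many breaches under a correctly calibrated 99% model can be computed with a simple binomial test. This is an optional check, not part of the original workflow, and it uses scipy, an additional dependency not listed among the tools above.

from scipy.stats import binom

# P(X >= num_breaches) when each day is an independent breach with probability alpha
p_value = binom.sf(num_breaches - 1, total_days, alpha)
print(f"Probability of {num_breaches} or more breaches in {total_days} days "
      f"under a well-calibrated 99% model: {p_value:.2%}")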

1.6 Visualizing the results¶

In the graph, actual returns are plotted against the VaR estimates, and breaches are highlighted for a comprehensive presentation.

In [6]:
# Specifying the figure size
plt.figure(figsize=(15, 7))

# Plotting Actual Returns and VaR
plt.plot(results_df.index, results_df['Actual_Return'], 'b', label='Actual Portfolio Return')
plt.plot(results_df.index, -results_df['VaR_99'], 'r--', label='-VaR (99%)')

# Marking the breach days
breach_dates = breach_days.index
breach_values = breach_days['Actual_Return']
plt.scatter(breach_dates, breach_values, color='r', marker='o', s=100, label='Breach Event')

# Formatting the plot
plt.title('VaR Backtest: Portfolio Returns vs. 99% Historical VaR', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Daily Return', fontsize=12)
plt.legend()
plt.tight_layout()

# Displaying the plot
plt.show()
[Figure: VaR Backtest plot showing daily portfolio returns against the -99% VaR line, with the three breach days marked]

Note:¶

This sums up the quantitative analysis for this project. The code has provided the "what" and the "when" parts of the story. The next step is to search for the "why". The data have pinpointed the exact days on which breaches occurred, and these are the days investigated further in the qualitative analysis part of this project.


Part 2 - The Qualitative News Analysis¶

  • Tools: Python, with the additional libraries os (standard library), nltk, requests and google-generativeai. If the third-party libraries are not installed, run 'pip install requests nltk google-generativeai' in a terminal or command prompt.

    • requests: A simple yet powerful library for making HTTP requests to websites and APIs. This is how the script will "talk" to the NewsAPI.
    • nltk: The Natural Language Toolkit — A comprehensive library for working with human language data. It is used here for sentiment analysis.
    • google-generativeai: The official Python library provided by Google to interact with their generative AI models, like Gemini.
    • nltk.download(...): The NLTK library is modular. The sentiment analysis tool that will be used - VADER - requires a pre-trained dictionary of words and their sentiment scores (the "lexicon"). This command downloads that specific data.

2.1 Importing necessary libraries and modules¶

In [7]:
# Importing the libraries for data analysis, sending API requests and interacting with AI models
import pandas as pd
import requests
import os
from datetime import datetime
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import google.generativeai as genai

# Downloading the lexicon for VADER
nltk.download('vader_lexicon')
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /home/latika/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!
Out[7]:
True

2.2 API Key configuration¶

  • API keys are like passwords for programs — they are unique codes used to identify and authenticate a user or application trying to access an API (Application Programming Interface). It's essentially a secret code that allows applications to interact with another application's features or data.
  • For this project, two API keys are required:
    • MarketAux API Key - To access financial news data from the MarketAux website. It can be obtained by registering on https://www.marketaux.com/ > Get FREE API Key
    • Gemini API Key - To interact with Google's AI model. It can be obtained by signing in https://aistudio.google.com/prompts/new_chat with Google account > Get API Key > Create API Key.
  • As a safety measure, the API keys have been replaced by placeholders here.
In [ ]:
# Replace the placeholder text with actual API keys.
MARKETAUX_API_KEY = "YOUR MARKETAUX API KEY HERE"
GEMINI_API_KEY = "YOUR GOOGLE AI STUDIO API KEY HERE"
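As an alternative to pasting keys directly into the notebook, they can be read from environment variables. The snippet below is only a sketch of that approach; the environment variable names used here are assumptions, not part of the original setup.

import os  # already imported in section 2.1

# Reading the keys from environment variables, falling back to the placeholders if they are not set
MARKETAUX_API_KEY = os.environ.get("MARKETAUX_API_KEY", "YOUR MARKETAUX API KEY HERE")
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "YOUR GOOGLE AI STUDIO API KEY HERE")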

2.2.1 Pro-Active debugging Step¶

The following action points came up during debugging and are mentioned here to help the user avoid similar issues when implementing this project.

  • During the initial few runs, the program ran into multiple errors when trying to interact with Google's AI model, caused by differences in the versions of the models available for use.
  • This step lists all models available for the given API key, so the user can proactively choose which one to use for the project.
  • It was also helpful to read the API documentation, since it revealed that some of the models, even though listed here, were being deprecated.
  • It is also important to be mindful of rate limits. The RPM (Requests Per Minute) limit exists to ensure fair usage (e.g., perhaps 15 requests per minute). If the code runs a for loop with one API request per iteration, it may fire several requests almost instantly. The easy fix is a time delay inside the loop, as sketched below. In this project, the gemini-2.5-flash model is used and a delay of 15 seconds was deliberately added to respect that model's RPM limit.
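The arithmetic behind the delay is simple: an RPM limit of R allows at most one request every 60 / R seconds. The figure below uses the 15 RPM example from the bullet above, not a documented quota for any particular model.

# Minimum spacing implied by an example RPM limit; the project sleeps 15 seconds per iteration for extra headroom
requests_per_minute_limit = 15
min_delay_seconds = 60 / requests_per_minute_limit
print(f"At {requests_per_minute_limit} RPM, requests must be spaced at least {min_delay_seconds:.0f} seconds apart.")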
In [15]:
# This is a proactive step to be aware of all the models that can be accessed using the given API Key.
print("--- Available Gemini Models ---")
try:
    genai.configure(api_key=GEMINI_API_KEY)
    for m in genai.list_models():
      if 'generateContent' in m.supported_generation_methods:
        print(m.name)
except Exception as e:
    print(f"Could not list models. Error: {e}")
--- Available Gemini Models ---
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-04-17
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-preview-04-17-thinking
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/gemini-2.0-flash-thinking-exp-1219
models/gemini-2.5-flash-preview-tts
models/gemini-2.5-pro-preview-tts
models/learnlm-2.0-flash-experimental
models/gemma-3-1b-it
models/gemma-3-4b-it
models/gemma-3-12b-it
models/gemma-3-27b-it
models/gemma-3n-e4b-it
models/gemma-3n-e2b-it

2.3 Defining functions¶

This part includes the reusable functions that will be helpful in doing the news analysis.

2.3.1 get_financial_news(...):¶

  • Purpose: This function's sole job is to fetch news headlines for a given date and set of keywords.
  • URL Construction: It builds the specific web address (URL) to send to the MarketAux API. Each part of the URL is a parameter that refines the search (e.g., the search query, the 'published on' date, language=en for English, and the limit). It was useful to read the API documentation to learn how best to construct a search query and to be aware of the limit for the specific plan being used.
  • try...except Block: This is for error handling. The internet isn't always reliable. If the request fails (e.g., no internet connection, API server is down), this block "catches" the error and prints a helpful message instead of crashing the script.
  • JSON Parsing: APIs typically return data in a format called JSON. response.json() converts this raw text data into a Python dictionary, which is easy to work with. .get('data', []) is a safe way to pull out the list of articles, returning an empty list if none are found.
  • Return Value: It returns a simple Python list of headline strings.
In [17]:
# Defining the function to Fetch News 
def get_financial_news(date_str: str, keywords: list) -> list:
    """Fetches financial news headlines for a specific date from MarketAux API."""
    
    search_query = " | ".join(f'"{term}"' for term in keywords)
    url = (f"https://api.marketaux.com/v1/news/all?"
           f"search={search_query}"
           f"&language=en"
           f"&published_on={date_str}"
           f"&limit=3"  
           f"&api_token={MARKETAUX_API_KEY}")
    
    # Using a try-except block to keep the program from crashing entirely
    try:
        response = requests.get(url)
        response.raise_for_status()
        
        # Extracting the data from JSON format
        articles = response.json().get('data', [])
        return [article['title'] for article in articles]
    
    except requests.exceptions.RequestException as e:
        print(f"Error fetching news from MarketAux: {e}")
        return []
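A quick usage illustration of the function above. The date and keywords are examples; a valid MARKETAUX_API_KEY and network access are required, and the headlines returned will depend on the plan and the API's news index.

# Illustrative call; returns a (possibly empty) list of headline strings
sample_headlines = get_financial_news("2023-03-09", ["Silicon Valley Bank", "KRE"])
print(sample_headlines)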

2.3.2 get_sentiment_score(...):¶

  • Purpose: To calculate a single number representing the overall "mood" of the news.
  • VADER: It uses NLTK's SentimentIntensityAnalyzer. VADER is great for short texts like headlines because it's tuned to social media language and common expressions.
  • Compound Score: For each headline, VADER calculates a compound score ranging from -1 (extremely negative) to +1 (extremely positive), with 0 being neutral.
  • Return Value: The function returns the average compound score of all headlines for that day, giving a single, comparable metric.
In [11]:
# Defining the function for Sentiment Analysis 
def get_sentiment_score(headlines: list) -> float:
    """Performs sentiment analysis on headlines and returns the average compound score."""
    
    if not headlines: return 0.0
    sia = SentimentIntensityAnalyzer()
    
    # Calculating the compound score for each headline and summing them to compute the average compound score
    total_compound_score = sum(sia.polarity_scores(headline)['compound'] for headline in headlines)
    avg_compound_score = total_compound_score / len(headlines)
    return avg_compound_score
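A quick usage illustration of the sentiment function. The headline is made up (not project data); the exact score depends on the VADER lexicon version, but alarming language like this is expected to score negative.

# Illustrative call with a made-up headline
example_headlines = ["Bank stocks crater as lender faces sudden liquidity crisis"]
print(get_sentiment_score(example_headlines))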

2.3.3 generate_news_summary(...):¶

  • Purpose: This is the most advanced function and acts as the AI financial analyst.
  • Prompt Engineering: This is the art of giving clear instructions to an AI. A detailed "prompt" is required to tell the Gemini model:
    • Its Role: "You are a professional financial analyst..."
    • The Context: The exact date, the VaR prediction, and the actual loss.
    • The Raw Data: The list of headlines that were fetched.
    • Its Task: A very specific instruction to "synthesize the information into a coherent narrative" and "explain the 'why'." This is crucial for getting a high-quality, analytical summary instead of a simple list.
  • API Call: model.generate_content(prompt) sends the entire set of instructions to the Gemini API, which processes it and generates the text.
In [18]:
# Defining the function to Generate AI Summary 
def generate_news_summary(headlines: list, date_str: str, var_value: float, actual_loss: float) -> str:
    """Uses Google's Gemini LLM to generate a 'News-Driven Cause' summary."""
    
    if not headlines: return "No news headlines were found for this date to generate a summary."

    # Formatting the headlines into a clean, bulleted list for the AI
    headline_str = "\n".join(f"- {h}" for h in headlines)
    
    # Specifying a detailed instruction set (the "prompt") for the AI
    prompt = f"""
    You are a professional financial analyst writing a risk report. Your task is to explain why a significant market loss occurred on a specific date, based on the news headlines from that day.

    **Context:**
    - **Date of Breach:** {date_str}
    - **Predicted Max Loss (VaR):** {var_value:.2f}%
    - **Actual Loss:** {actual_loss:.2f}%

    **News Headlines from that Day:**
    {headline_str}

    **Your Task:**
    Analyze these headlines and write a concise, one-paragraph explanation (the "News-Driven Cause") for the market drop. Synthesize the information into a coherent narrative. Focus on the most impactful events and entities (like SVB, Credit Suisse, etc.) if they are prominent. Explain the 'why'.
    """
    
    # Enclosing the API call in a try-except block to avoid crashing the entire program
    try:
        model = genai.GenerativeModel('gemini-2.5-flash')
        response = model.generate_content(prompt)
        return response.text.strip()
    except Exception as e:
        return f"An error occurred while generating the summary with the LLM: {e}"

2.4 Running the automated analysis¶

  • The breach_days DataFrame that was created in Part 1 can now be used to perform the analysis.
  • Data Extraction: Inside the loop, the date is pulled out and converted into the YYYY-MM-DD string format that the API requires.
  • search_keywords: For the SVB crisis, terms like 'SVB', 'KRE', and 'banking crisis' are highly relevant and are therefore specified in this list. The loop then prints the final, formatted report for each breach day it finds.
In [13]:
# Importing the 'time' library to add pauses
import time

# Loop through each row in the `breach_days` DataFrame
for index, row in breach_days.iterrows():
    
    # Extracting the data from the row. The date is the index, so `row.name` is used.
    breach_date_dt = row.name
    breach_date_str = breach_date_dt.strftime('%Y-%m-%d')
    var_value = row['VaR_99'] * 100
    actual_loss = abs(row['Actual_Return']) * 100

    # Print a header for the current day's analysis
    print("==========================================================")
    print(f"Analyzing Breach on: {breach_date_dt.strftime('%B %d, %Y')}")
    print("==========================================================")

    # Defining keywords and fetching news
    search_keywords = ["SIVB", "SBNY", "FRC", "CS", "KRE", "banking crisis", "bank run", "contagion", "Silicon Valley Bank", "selling bonds", "S&P"]
    headlines = get_financial_news(breach_date_str, search_keywords)
    
    if not headlines:
        print("No relevant news found for the given keywords on this date.\n")
        # Skip to the next date in the loop
        continue 
        
    print(f"Found {len(headlines)} relevant headlines.")

    # Calculating the sentiment Score
    sentiment_score = get_sentiment_score(headlines)
    
    # Generating summary with LLM
    news_cause = generate_news_summary(headlines, breach_date_str, var_value, actual_loss)

    # Print the final, formatted output for this day's report
    print("\n*** Model Result & News-Driven Cause ***\n")
    print(f"*   **Model Result:** VaR predicted a max loss of {var_value:.2f}%; the actual loss was {actual_loss:.2f}%.")
    print(f"*   **News Sentiment Score:** {sentiment_score:.3f} (A score below -0.1 is typically negative)")
    print(f"*   **News-Driven Cause:** {news_cause}")
    print("\n")
    
    # Adding a short pause to respect the API's rate limit (requests per minute). This prevents the 429 "Too Many Requests" error.
    time.sleep(15)
==========================================================
Analyzing Breach on: March 09, 2023
==========================================================
Found 3 relevant headlines.

*** Model Result & News-Driven Cause ***

*   **Model Result:** VaR predicted a max loss of 3.34%; the actual loss was 5.30%.
*   **News Sentiment Score:** -0.507 (A score below -0.1 is typically negative)
*   **News-Driven Cause:** On March 9, 2023, the market experienced a significant downturn, with actual losses exceeding the predicted VaR, primarily due to an acute crisis of confidence in the banking sector. News headlines detailed Silicon Valley Bank's (SVB) "sudden liquidity crisis," which resulted in a "record 60% crash" in its stock and caused "bank stocks [to] crater" broadly. This severe distress at SVB sent a stark "warning sign" across the financial system, leading to widespread contagion and fear, evidenced by other institutions like Signature Bank experiencing stock declines despite claims of financial strength. The rapid and systemic erosion of trust in banks, triggered by SVB's woes, explains the amplified market losses beyond expectations.


==========================================================
Analyzing Breach on: March 13, 2023
==========================================================
Found 3 relevant headlines.

*** Model Result & News-Driven Cause ***

*   **Model Result:** VaR predicted a max loss of 3.50%; the actual loss was 5.03%.
*   **News Sentiment Score:** -0.640 (A score below -0.1 is typically negative)
*   **News-Driven Cause:** The significant market loss on March 13, 2023, which exceeded the predicted VaR, was primarily driven by a sharp resurgence in regional bank contagion fears, despite initial signs of fading anxiety from the Silicon Valley Bank (SVB) crash. While US stock futures initially rose on hopes of containing the SVB fallout, this optimism was quickly overwhelmed by the dramatic 70% plunge in First Republic Bank (FRC) shares. This severe decline, compounded by an analyst downgrade from RayJay highlighting downside risks to FRC's earnings, signaled that the systemic "regional bank worry" had not abated but rather intensified and spread, leading to a broader market sell-off that pushed actual losses beyond expected limits.


==========================================================
Analyzing Breach on: March 17, 2023
==========================================================
Found 3 relevant headlines.

*** Model Result & News-Driven Cause ***

*   **Model Result:** VaR predicted a max loss of 3.66%; the actual loss was 3.74%.
*   **News Sentiment Score:** -0.412 (A score below -0.1 is typically negative)
*   **News-Driven Cause:** On March 17, 2023, the actual market loss of 3.74% notably exceeded the predicted Value at Risk (VaR) of 3.66%, primarily driven by widespread and escalating anxieties within the financial sector. News headlines from the day reflected a profound crisis, articulating that "Banks in Danger" and directly questioning if Wall Street analysts were "ignoring the banking collapse." This pervasive sense of systemic vulnerability, amplified by reports of "Short Sellers Post Profits of $3.5 Billion on Banks’ Woes," signaled a significant erosion of investor confidence in the stability of financial institutions, leading to a broad market sell-off as concerns over contagion and underlying weakness permeated sentiment.


Concluding remarks:¶

Looking at the news analyses, the sentiment scores for all three breach days are clearly negative. Further, the LLM-driven summaries of the financial news reveal that the market was responding to an unprecedented sequence of events. The Historical Simulation VaR model failed because it is inherently backward-looking: its risk estimate on March 9th was based on the relatively calm preceding year and contained no information about the possibility of a sudden bank run that would dominate the headlines for days to come.

This exercise demonstrates the critical need for forward-looking risk tools such as Stress Testing and Scenario Analysis to supplement daily VaR models.